CDCA7-associated global aberrant DNA hypomethylation translates to localized, tissue-specific transcriptional responses

Disruption of cell division cycle associated 7 (CDCA7) has been linked to aberrant DNA hypomethylation, but the impact of DNA methylation loss on transcription has not been investigated. Here, we show that CDCA7 is critical for maintaining global DNA methylation levels across multiple tissues in vivo. A pathogenic Cdca7 missense variant leads to the formation of large, aberrantly hypomethylated domains overlapping with the B genomic compartment but without affecting the deposition of H3K9 trimethylation (H3K9me3). CDCA7-associated aberrant DNA hypomethylation translated to localized, tissue-specific transcriptional dysregulation that affected large gene clusters. In the brain, we identify CDCA7 as a transcriptional repressor and epigenetic regulator of clustered protocadherin isoform choice. Increased protocadherin isoform expression frequency is accompanied by DNA methylation loss, gain of H3K4 trimethylation (H3K4me3), and increased binding of the transcriptional regulator CCCTC-binding factor (CTCF). Overall, our in vivo work identifies a key role for CDCA7 in safeguarding tissue-specific expression of gene clusters via the DNA methylation pathway.


INTRODUCTION
DNA methylation is a well-known epigenetic modification associated with transcriptional repression and is pivotal for normal development (1). Genome-wide aberrant methylation is increasingly being reported in developmental disorders (2). Links between DNA methylation alterations and gene expression changes have been widely explored; however, the relation, significance, and functional relevance of these alterations remain poorly understood.
Recessive missense mutations in CDCA7 (cell division cycle associated 7) can cause immunodeficiency, centromeric instability, facial anomalies (ICF) syndrome (3), a genetically heterogeneous disorder characterized by reduced levels or absence of antibodies, facial dysmorphisms, and neurodevelopmental delay (4). In the blood of patients with ICF carrying CDCA7 missense mutations, aberrant DNA hypomethylation of CpG-poor genomic regions and (peri)centromeric satellite repeat sequences has been observed (3,5). CDCA7 knockdown and knockout studies in mouse (3,6) and human cell lines (7) showed that CDCA7 disruption reduces DNA methylation levels at (peri)centromeric satellite repeats. We recently reported an association between CDCA7 expression levels and trans methylation changes at transcriptionally repressed sites in the blood of healthy individuals (8). In vitro studies identified an interaction between CDCA7 and the chromatin remodeler helicase lymphoid-specific (HELLS/Lsh) in HEK293 cells (7) and Xenopus egg extract (9). When mutated in humans, HELLS also causes ICF syndrome (3), and patients with CDCA7 and HELLS ICF show overlapping DNA methylation defects in the blood (5). Thus, several lines of evidence indicate that CDCA7 is a previously unrecognized factor required for DNA methylation regulation. However, how ICF3 pathogenic CDCA7-associated methylation alterations translate to gene expression changes in vivo has not been investigated.
In this study, we generated mice carrying an ICF syndrome-causing CDCA7 missense variant and used the Cdca7 G305V mouse model as a tool to comprehensively investigate the epigenetic and transcriptional consequences of CDCA7 disruption in vivo. Overall, our data show that CDCA7 is a transcriptional repressor with a role in safeguarding tissue-specific expression of gene clusters via DNA methylation.
We first assessed the viability of Cdca7 G305V mutants. We found that offspring of all genotypes from heterozygous intercrosses were present at normal Mendelian frequencies at mid-gestation [embryonic day 14.5 (E14.5)] and at postnatal day 21 (P21) (Fig. 1B and fig. S1B). While, at P21, Cdca7 G305V homozygotes did not show overt phenotypes, we observed that homozygous females, on average, weighed less than wild type (WT) (t test, P < 0.0013) and heterozygotes (t test, P < 0.0005), while no differences in variance (F test, P < 0.39 and P < 0.9, respectively) were found (fig. S1C). However, in this study, we did not pursue any further in-depth phenotyping.
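As an illustration of what "normal Mendelian frequencies" means for a heterozygous intercross, the sketch below compares observed genotype counts with the expected 1:2:1 ratio using a chi-square goodness-of-fit statistic. The text does not state which test was applied, so the choice of test, the function name `mendelian_chi_square`, and the litter counts are assumptions made purely for illustration.

```python
def mendelian_chi_square(wt, het, hom):
    """Chi-square statistic for observed genotype counts against a 1:2:1 ratio."""
    observed = [wt, het, hom]
    total = sum(observed)
    # Expected counts for WT : heterozygous : homozygous offspring
    expected = [total * 0.25, total * 0.50, total * 0.25]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Critical value of the chi-square distribution (2 degrees of freedom, alpha = 0.05)
CHI2_CRIT_DF2 = 5.991

# Hypothetical counts, NOT data from this study
stat = mendelian_chi_square(wt=24, het=52, hom=24)
consistent = stat < CHI2_CRIT_DF2
print(f"chi2 = {stat:.2f}; consistent with 1:2:1 = {consistent}")
```

A statistic below the critical value means the observed counts give no evidence of a departure from the Mendelian expectation, which is how an absence of embryonic or early postnatal lethality would typically show up.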
We then measured CDCA7 expression at P21 in tissues representing different germ layers by Western blot and quantitative reverse-transcriptase polymerase chain reaction (RT-qPCR) and found higher CDCA7 levels in proliferative tissues (spleen and thymus) of WT animals (Fig. 1C and fig. S1D). We detected no statistically significant differences in mRNA and no notable differences in protein levels in Cdca7 G305V homozygotes when compared to WT and heterozygous littermates (Fig. 1D and fig. S1E), indicating that the G305V substitution does not influence CDCA7 levels in spleen and thymus. Next, we performed cellular fractionation assays on E10.5 embryo lysates from WT and Cdca7 G305V homozygotes to determine the localization of CDCA7 in the cell. In WT, we detected CDCA7 protein in both the cytoplasm and the nucleus with the highest levels in the chromatin-bound fraction. In Cdca7 G305V homozygotes, we observed lower amounts of CDCA7 protein in the chromatin-bound fraction (Fig. 1E), indicating that the ICF3 pathogenic G305V substitution in the CXXC-ZF domain interferes with CDCA7 chromatin association in vivo.

CDCA7 influences the variegated expression of a transgene array in the spleen via DNA methylation
We previously used Line3 in an in vivo chemical mutagenesis screen to identify modifiers of transgene array silencing, where mutants are classified, according to the Drosophila position effect variegation nomenclature, into Suppressors or Enhancers of variegation [Su(var) or E(var)] based on their ability to increase or decrease the percentage of green fluorescent protein (GFP)-expressing erythrocytes, respectively (10). From this screen, DNA methyltransferase 3 beta (DNMT3B), DNMT1, and ubiquitin-like with PHD and ring finger domains 1 (UHRF1) have been recovered (11–13), demonstrating its suitability to identify key DNA methylation factors. Mutant alleles are referred to as Modifiers of murine metastable epialleles Dominant (MommeD) and, in the WT Line3, the GFP transgene array is expressed in a variegated manner in ~50% of red blood cells, a result of stochastic epigenetic silencing that occurs in cells of the same type (10). Flow cytometry analyses on embryonic spleen revealed a significant increase in the percentage of erythrocytes expressing GFP in Cdca7 G305V homozygotes when compared to WT (average GFP values for WT = 46% and homozygotes = 63%), consistent with a Su(var) phenotype (Fig. 1F and fig. S2). Sanger bisulfite sequencing showed that increased GFP expression was accompanied by lower transgene DNA methylation levels in Cdca7 G305V homozygotes (average methylation levels for WT = 49% and homozygotes ~37%) (Fig. 1F). We further found that increased Gfp mRNA levels [RNA sequencing (RNA-seq)] correlated with transgene hypomethylation (whole-genome bisulfite sequencing, WGBS; 5-methylcytosine, 5mC) in P21 Cdca7 G305V mutant spleen (Fig. 1G), consistent with our finding in embryos. These results show that the Cdca7 G305V missense mutation increases the probability of a red blood cell expressing GFP by altering transgene DNA methylation levels and suggest that CDCA7 functions as a transcriptional repressor.

CDCA7 is required to maintain global DNA methylation levels in multiple tissues
As aberrant genome-wide DNA hypomethylation has been found in the blood of patients with ICF carrying CDCA7 missense mutations (5), we next used the luminometric methylation assay (LUMA), a method that relies on the presence of 5mCpG-sensitive restriction enzyme sites (14), to measure global methylation levels in P21 WT and Cdca7 G305V homozygotes. We found significantly reduced global methylation levels in tissues from different germ layers in the mutants, with up to ~20% difference between WT and Cdca7 G305V homozygotes (Fig. 2A). Southern blot analysis revealed minor satellite repeat hypomethylation in Cdca7 G305V mutant spleens when compared to WT. In addition, we found reduced methylation levels of intracisternal A particle (IAP) retrotransposons, while major satellite repeat methylation levels did not show major differences between WT and Cdca7 G305V homozygotes (Fig. 2B). This pattern of repeat hypomethylation in Cdca7 G305V mutants was found in both tissues with high (i.e., spleen and thymus) and low (i.e., liver) CDCA7 expression levels (Fig. 2B and fig. S3A). This effect on DNA methylation was not the result of the differential expression of key methylation machinery members in the spleen and thymus (Fig. 2C and fig. S3, B and C). Furthermore, cellular fractionation followed by Western blot in WT and Cdca7 G305V homozygous embryos revealed no obvious differences in cellular localization of DNMT1, UHRF1, and LSH/HELLS using this bulk assay, while global DNA methylation levels were reduced in the homozygotes when measured by LUMA (Fig. 2D). In addition, minor satellite repeats and IAP elements showed hypomethylation in embryos, as observed in P21 Cdca7 G305V mutant thymus and spleen (fig. S3D). Together, these results show that CDCA7 is critical to maintaining global DNA methylation levels across tissues and developmental stages in mice.
A closer manual inspection and UCSC genome browser visualization indicated that methylation loss in Cdca7 G305V homozygous spleens occurred over large domains and overlapped regions with predicted late replication timing, partial DNA methylation, and lamina-associated domains (LADs), which are found in many cell types throughout normal development (Fig. 3D and fig. S4D) (15–19). These observations therefore prompted us to examine whether CDCA7-associated methylation alterations affected genomic compartments rather than specific loci (Fig. 3E). Plotting methylation levels over previously defined constitutive LAD (cLAD) regions (18) revealed a ~30% methylation reduction in Cdca7 G305V homozygotes. We found the same trend of reduced DNA methylation levels over common partially methylated domains (PMDs) that also overlap with cLADs (17) and genes embedded in PMDs (Fig. 3E and fig. S5A). On the other hand, the degree of methylation over highly methylated domains (HMDs) was only slightly affected (Fig. 3E). Next, we asked if previously identified solo-WCGW sites (17) are also susceptible to the Cdca7 G305V mutation. Our analysis showed that the Cdca7 G305V mutation had a greater impact on solo-WCGW sites located in PMDs compared to those found in HMDs. On average, solo-WCGW sites in PMDs lost ~40% methylation, while at solo-WCGW sites in HMDs methylation was reduced by an average of ~5% (Fig. 3F and fig. S5B). It has been reported that Sine B1 (B1) and Line1 (L1) retrotransposons can be used to predict the A (active) and B (inactive) compartments of the genome, respectively (20). Plotting methylation levels over mouse B1 and L1 consensus sequences and ±3 kb of the surrounding regions revealed that methylation was more severely reduced (~30%) over L1 (average ~69% in WT versus ~39% in homozygotes) compared to B1 elements (~12%) (average ~72% in WT versus ~60% in homozygotes) in the mutants (Fig. 3G). Furthermore, using a previously reported list of B1- and L1-enriched genes (21), we found that L1-associated genes were more affected by methylation loss [average ~67% (WT) versus ~46% (mutants)] in Cdca7 G305V homozygotes than B1-associated genes [average ~66% (WT) versus ~63% (mutants)] (Fig. 3G). Notably, analysis of 17 curated imprinted control regions (ICRs) revealed that they were not affected by Cdca7 G305V-associated hypomethylation in the spleen (fig. S4C). Combined, these analyses show that CDCA7 preferentially promotes methylation at partially methylated cytosines that reside in the B genomic compartment.
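The reductions quoted for L1 and B1 elements (~30% and ~12%) correspond to differences in percentage points between the WT and mutant averages, not to relative losses. The short sketch below makes that arithmetic explicit using only the averages reported in the text; the helper name `methylation_loss` is an assumption introduced for illustration.

```python
# Average methylation (%) in WT versus Cdca7 G305V homozygous spleen, as quoted in the text
reported = {
    "L1 elements": (69, 39),
    "B1 elements": (72, 60),
    "L1-associated genes": (67, 46),
    "B1-associated genes": (66, 63),
}


def methylation_loss(wt_pct, mut_pct):
    """Return (absolute loss in percentage points, relative loss in % of WT)."""
    absolute = wt_pct - mut_pct
    relative = 100 * absolute / wt_pct
    return absolute, relative


losses = {name: methylation_loss(*values) for name, values in reported.items()}
for name, (absolute, relative) in losses.items():
    print(f"{name}: -{absolute} points ({relative:.1f}% of WT level)")
```

Expressed relative to the WT level, the loss over L1 elements is therefore larger than the quoted ~30 percentage points, while B1-associated genes remain close to WT by either measure.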

Aberrant DNA hypomethylation is associated with the up-regulation of lowly expressed genes in the spleen of Cdca7 G305V homozygotes
Next, we carried out RNA-seq on WT and Cdca7 G305V homozygous P21 spleens (n = 3 per genotype). Differential gene expression analysis identified 10 significantly differentially expressed genes (DEGs) between WT and Cdca7 G305V mutants when using Padj ≤ 0.05 and log2FC ≥ 1.5 as cutoffs (Fig. 4A, fig. S6A, and data S1). All 10 genes were up-regulated and include two members of large gene clusters, MAS-related GPR, member A4 (Mrgpra4), and vomeronasal 2, receptor 96 (Vmn2r96), linked to pruriception (22) and pheromone reception (23), respectively, and poorly annotated protein-coding genes (Gm2617 and Gm5734 and a cluster comprising Gm16028, Gm15433, and Gm7609). In addition, mRNA levels of C-type lectin domain family 4 member G (Clec4g), which belongs to the C-type lectin-like receptors that play a role in the immune response (24), and olfactomedin 4 (Olfm4), a member of the olfactomedin family that has roles in cell adhesion (25), were increased (Fig. 4A, fig. S6, A and B, and data S1).
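The cutoffs named above can be read as a simple two-condition filter on a differential expression table. The following is a minimal sketch of that filter, assuming the log2 fold-change threshold is applied to the absolute value (the text does not say); the `GeneResult` type, the `select_degs` helper, and the example rows are hypothetical and are not taken from data S1.

```python
from typing import NamedTuple


class GeneResult(NamedTuple):
    gene: str
    log2fc: float  # log2 fold change, mutant versus WT
    padj: float    # multiple-testing-adjusted P value


def select_degs(results, padj_max=0.05, log2fc_min=1.5):
    """Keep genes passing both the adjusted P value and the fold-change cutoff."""
    return [
        r for r in results
        if r.padj <= padj_max and abs(r.log2fc) >= log2fc_min
    ]


# Hypothetical rows for illustration only
table = [
    GeneResult("geneA", 3.2, 0.001),   # passes both cutoffs
    GeneResult("geneB", 0.8, 0.0001),  # significant but small effect
    GeneResult("geneC", 2.5, 0.20),    # large effect but not significant
    GeneResult("geneD", -1.9, 0.03),   # passes both cutoffs (down-regulated)
]

degs = select_degs(table)
print([r.gene for r in degs])
```

Requiring both conditions keeps only changes that are statistically supported and of appreciable size, which is one reason a global methylation defect can yield a short list of DEGs.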
We then compared gene expression with DNA methylation changes. Overall, transcriptionally silent genes were most affected by Cdca7 G305V-associated hypomethylation (Fig. 4B), in agreement with our observation that CDCA7 preferentially promotes DNA methylation of L1-enriched genes (Fig. 3G). Furthermore, the majority of mappable DEGs (n = 7/8) also covered by WGBS were located in regions that lost methylation (~30%) in Cdca7 G305V mutants (fig. S6C). However, a closer manual inspection also revealed that six of eight DEGs already showed some low-level expression (threshold set to an average of ≥5 reads) in WT (fig. S6, B and D, and data S1). Thus, although we found that CDCA7 is necessary to maintain global DNA methylation levels in the spleen, the transcriptional responses to aberrant hypomethylation in Cdca7 G305V homozygotes were largely confined to the up-regulation of a small set of lowly expressed genes. Notably, the silencing of transposable elements was also not perturbed in the Cdca7 G305V mutants in our bulk RNA-seq analysis (fig. S6E).

CDCA7 promotes DNA methylation at H3K9me3 marked genomic sites but is dispensable for H3K9me3 deposition in spleen
The mild transcriptional effect despite the relatively large methylation differences observed in Cdca7 G305V homozygotes suggests that silencing can be maintained by alternative mechanisms. The B genomic compartment is enriched for the repressive histone mark H3K9me3, and DNA methylation and H3K9me3 can be positively correlated (2). Therefore, we next investigated whether CDCA7 disruption also influenced H3K9me3 distribution and performed chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) in the P21 spleen. We found that the majority of H3K9me3 peaks (72.3 and 72.8% out of n = 10,656 consensus peaks called for WT and Cdca7 G305V, respectively) mapped to intergenic regions, where they often overlap with repetitive elements (fig. S7A). Intersecting WGBS and H3K9me3 ChIP-seq datasets revealed reduced DNA methylation at H3K9me3-enriched sites in Cdca7 G305V mutant spleens (Fig. 4C). Notably, the effect of the Cdca7 G305V mutation on deposition of H3K9me3 was minimal, with <1% [n = 29; false discovery rate (FDR) ≤ 0.05] of peaks showing a differential enrichment between WT and Cdca7 G305V spleens (fig. S7B and data S1). H3K9me3 enrichment in Cdca7 G305V homozygous spleens was unchanged at known H3K9me3 sites including endogenous retroviruses (ERVs) (Fig. 4D and fig. S7C) and developmentally silenced genes (26) such as the cytochrome P450 (Fig. 4E) and zinc finger and SCAN domain containing 4 (Zscan4) gene clusters (fig. S7D), suggesting that reduced DNA methylation levels in Cdca7 G305V mutants did not influence the repressive chromatin environment at these sites. Combined, these data suggest that CDCA7 is dispensable for the deposition of the heterochromatin mark H3K9me3 genome-wide and that the residual DNA methylation in combination with H3K9me3 is sufficient to maintain transcriptional repression of H3K9me3 target sites in Cdca7 G305V mutant spleen.

Aberrant DNA hypomethylation is accompanied by increased H3K27me3 at gene clusters in Cdca7 G305V homozygous spleen
In addition to intergenic regions, promoters lost DNA methylation in Cdca7 G305V spleens (Fig. 3C). While a negative correlation between promoter methylation and the repressive histone modification H3K27 trimethylation (H3K27me3) has been reported (27), it has also been shown that H3K27me3 distribution can be altered in DNA methylation mutants (28,29). Therefore, we next profiled H3K27me3 using ChIP-seq in WT and Cdca7 G305V mutant P21 spleen (n = 2 per genotype). The majority of peaks (66 and 63% in WT and Cdca7 G305V mutants, respectively) overlapped with promoters (fig. S8A). We identified a total of 310 H3K27me3 differentially enriched peaks (FDR ≤ 0.05) between WT and Cdca7 G305V spleens. A total of 175 peaks were associated with H3K27me3 gain and 135 peaks were associated with H3K27me3 loss in the mutants; of these differentially enriched peaks, 52% (n = 92) and 63% (n = 85), respectively, were annotated as promoters [transcriptional start site (TSS) ± 1.5 kb] (Fig. 4F, fig. S8B, and data S1). A heatmap representation further revealed that the gain of H3K27me3 in Cdca7 G305V mutants occurred at hypomethylated sites, whereas reduced H3K27me3 was independent of DNA methylation alterations (fig. S8C). We obtained a similar result when investigating promoter-associated (TSS ± 1.5 kb) peaks, which covered 571 promoters (Fig. 4G and data S1). Notably, around one-third of the peaks (34%, 59 of 175) where H3K27me3 gain is accompanied by methylation loss mapped to gene clusters, and the transition to H3K27me3 extended over long (up to 217 kb) stretches, encompassing promoters and gene bodies. Notable examples include the clustered protocadherin (Pcdh) genes (Fig. 4H and fig. S9A), olfactory receptor, keratin cluster, and vomeronasal receptor genes (fig. S9B). Integrating gene expression and ChIP-seq data further showed that genes with H3K27me3 gain remained transcriptionally silent in Cdca7 G305V mutants (Fig. 4I). From these data, we conclude that genome-wide H3K27me3 distribution is altered in Cdca7 G305V mutant spleen and that higher enrichment of H3K27me3-associated factors at hypomethylated promoters could contribute to the sustained repression of the associated genes.

Clustered protocadherin gene expression and chromatin state are dysregulated in Cdca7 G305V homozygous embryonic brain
It has previously been reported that PMDs are enriched for gene clusters with brain-specific expression (30). Considering that we did not see large transcriptional deregulation in CDCA7 mutant spleens, we wondered if this was due to the fact that this tissue lacks transactivating factors required to take advantage of the aberrantly hypomethylated state. We therefore studied this in the E14.5 cerebrum where CDCA7 is expressed (fig. S10A), using droplet-based single-nucleus (sn) RNA-seq (n = 2 per genotype). After stringent filtering, a total of 25,923 nuclei were retained, then integrated and analyzed together (Fig. 5A). The expression of known neuronal marker genes (31)  When averaged across all four samples, most cells (59%) were annotated as excitatory and inhibitory neurons. In addition, we detected proliferating cells (30%) and neuronal progenitors (9.6%) with high concordance between all four samples (fig. S11C). These results suggest that CDCA7 is dispensable for early neurogenesis, and we found no obvious phenotypic differences between WT and Cdca7 G305V homozygous mutant brains (fig. S12A). Accordingly, pseudo-bulk transcriptome analyses (based on the snRNA-seq data) on the different cell populations identified only small numbers of DEGs (DESeq-2, log2FC > 1.5, Padj < 0.01). Of note, proliferating cells and inhibitory neurons showed a mild, yet significant down-regulation of cell cycle and chromosome segregation genes, e.g., abnormal spindle-like, microcephaly-associated (Aspm), and Centrosomal Protein 55 (Cep55), which have been linked to craniofacial development (32,33). All up-regulated genes in the Cdca7 G305V cerebrum belonged to the clustered Pcdh genes, and they showed increased expression in all cell populations (Fig. 5B and data S2).
We next investigated whether transcriptional dysregulation was reflected at the chromatin level using bulk-level native-ChIP (NChIP)-seq for the active histone mark H3K4me3 and the repressive modification H3K27me3, in E14.5 cerebrum. While the overall distribution of H3K4me3 was relatively unaffected by the Cdca7 G305V mutation, we identified 21 differentially enriched peaks, 17 of which mapped to Pcdha isoforms and Pcdhb genes (Fig. 5C, fig. S12B, and data S1), in agreement with their increased expression observed by snRNA-seq (Fig. 5B). In addition, we found localized perturbation of H3K27me3 deposition in Cdca7 G305V homozygous mutants. Genome-wide Miami plot visualization identified the cPcdh and homeobox (Hox) gene clusters to be among the loci with the most significant H3K27me3 gain and loss in Cdca7 G305V mutant cerebrum, respectively (Fig. 5D and fig. S12C), while the overall effect of CDCA7 disruption on H3K27me3 levels was modest and no difference in global H3K27me3 levels was observed (fig. S12D). Together, these data show that the transcriptional consequences of Cdca7 G305V-associated ~16% global methylation loss in E14.5 cerebrum, as observed by LUMA (Fig. 5E), are largely confined to the cPcdh genes where they are accompanied by changes in chromatin state and gene expression (Fig. 5F and fig. S12E).

CDCA7 is a modifier of clustered protocadherin alpha stochastic promoter choice
In the mouse, the Pcdha gene cluster consists of 12 variable and 2 constitutive isoforms (Fig. 6A), and combinatorial expression of individual variable isoforms in single neurons has been suggested to generate a barcode critical for neuronal diversity and network formation (34). Previous studies have estimated that, depending on neuronal type and developmental stage, between one and five variable Pcdha isoforms are expressed in individual neurons (35–40), likely the result of stochastic epigenetic processes in cells of the same type.
At the molecular level, both H3K9me3 (41,42) and DNA methylation pathways (35,43) have been implicated in the regulation of Pcdha expression and stochastic isoform choice. Using Sanger bisulfite sequencing, we found reduced Pcdha promoter DNA methylation levels in Cdca7 G305V mutant cerebrum at all variable isoforms tested (Fig. 6B and fig. S13A), consistent with a role for CDCA7 in DNA methylation regulation. We also considered the effect of the Cdca7 G305V substitution at the R1 site, where SET domain bifurcated 1 (Setdb1)-mediated H3K9me3 has previously been shown to influence cPcdh expression (41). We observed no substantial differences in H3K9me3 levels between WT and Cdca7 G305V mutants at the R1 site when using ChIP-qPCR (fig. S13B) or at global levels (fig. S13C). This suggests that this site is epigenetically controlled by a pathway that does not require CDCA7 and is in agreement with our finding that CDCA7 disruption does not affect H3K9me3 deposition in the spleen. However, profiling of the entire Pcdh locus or genome-wide will be required to exclude Cdca7 G305V-associated changes in H3K9me3 enrichment elsewhere.
Next, we examined if the altered Pcdha chromatin environment influenced the mechanism of variable isoform choice in the Cdca7 G305V homozygotes. Because our snRNA-seq experiment was carried out using the Chromium 5′-library, sequencing reads can be assigned to most individual Pcdha cluster members (all except Pcdha10/11), which allowed us to analyze expression frequencies of variable Pcdha isoforms in the different cell types (Fig. 6A). We detected nuclei expressing Pcdha isoforms in all annotated cell types including neuronal progenitors in both WT and Cdca7 G305V mutants (Fig. 6C). In WT, we found that isoforms positioned at the 5′ end of the gene cluster (i.e., Pcdha1 and a2) showed an overall lower expression frequency than isoforms located at the 3′ end (i.e., Pcdha12) (fig. S14A). In addition, we noticed that the Pcdha3, -4, and -6 isoforms were expressed at a higher frequency (fig. S14A), which could suggest their favored use in the E14.5 cerebrum. Heatmaps revealed that this pattern of Pcdha isoform expression was disrupted in the Cdca7 G305V cerebrum in all cell types analyzed (fig. S14B). The rarely used Pcdha1, a8, and several additional variable isoforms became expressed at significantly higher frequencies in the mutants (Fig. 6D, fig. S14, B and C, and data S2), concomitant with DNA methylation loss (Fig. 6B) and a gain in H3K4me3 (Fig. 5F). This pattern of increased expression likelihood was also reflected at the single-cell level. We observed significantly higher numbers of nuclei expressing Pcdha isoforms and increased numbers of isoforms per individual neuronal cell in Cdca7 G305V homozygotes (up to n = 6) when compared to WT (up to n = 4), in all assigned cell types (Fig. 6E, fig. S14D, and data S2). We then measured occupancy of the transcriptional regulator CTCF at variable Pcdha isoforms, since Pcdha promoters and the HS5-1 enhancer contain CTCF motifs (44–46). ChIP-qPCR experiments revealed increased CTCF binding in Cdca7 G305V mutant cerebrum at variable Pcdha isoforms (Fig. 6F) that also showed increased expression frequency (Fig. 6D) and DNA hypomethylation (Fig. 6B and fig. S13A). This is consistent with previous reports that CTCF binding directly correlates with Pcdha isoform expression (43,44,46).
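The two quantities discussed above, the expression frequency of an isoform across nuclei and the number of isoforms detected per nucleus, can both be derived from a nucleus-by-isoform count table. The sketch below shows one way to compute them; the detection threshold of at least one read, the helper names, and the toy counts are assumptions for illustration and do not reproduce the analysis pipeline or data of this study.

```python
def isoforms_per_nucleus(counts, min_reads=1):
    """Number of Pcdha isoforms detected in each nucleus."""
    return {
        nucleus: sum(1 for n in isoforms.values() if n >= min_reads)
        for nucleus, isoforms in counts.items()
    }


def expression_frequency(counts, isoform, min_reads=1):
    """Fraction of nuclei in which the given isoform is detected."""
    expressing = sum(
        1 for isoforms in counts.values() if isoforms.get(isoform, 0) >= min_reads
    )
    return expressing / len(counts)


# Toy read counts per nucleus (hypothetical, not data from this study)
toy_counts = {
    "nucleus_1": {"Pcdha1": 0, "Pcdha4": 3, "Pcdha12": 1},
    "nucleus_2": {"Pcdha1": 2, "Pcdha4": 0, "Pcdha12": 0},
    "nucleus_3": {"Pcdha1": 0, "Pcdha4": 5, "Pcdha12": 0},
    "nucleus_4": {"Pcdha1": 0, "Pcdha4": 0, "Pcdha12": 0},
}

per_nucleus = isoforms_per_nucleus(toy_counts)
freq_a4 = expression_frequency(toy_counts, "Pcdha4")
print(per_nucleus, freq_a4)
```

Comparing these summaries between genotypes within each annotated cell type is the kind of contrast that distinguishes a shift in how often an isoform is chosen from a change in how strongly it is expressed once chosen.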
Pcdha promoter DNA methylation is acquired in the developing embryo (35,47) and relies on DNMT3B activity (35,48). Recessive mutations in CDCA7 or DNMT3B can cause ICF syndrome (3,49–52). Therefore, we considered a potential connection between the two factors. By Western blot, we detected both CDCA7 and DNMT3B protein in E7.5 to E10.5 single embryos (Fig. 7A). We then measured DNA methylation levels of the Pcdha1 and Pcdha8 promoters by Sanger bisulfite sequencing and found reduced methylation in the Cdca7 G305V mutants at E7.5 (Fig. 7B). To corroborate a potential Pcdh cluster DNA methylation defect in Cdca7 G305V mutants, we performed an adaptive sampling target enrichment experiment using Oxford Nanopore Technologies (ONT) long-read sequencing (Fig. 7C). We focused on investigating DNA methylation of the Pcdh cluster genes and included representative germline genes [maelstrom spermatogenic transposon silencer (Mael) and ribosomal protein L10 like (Rpl10l)], imprinted genes [potassium voltage-gated channel subfamily q member 1 (Kcnq1) and paternally expressed 13 (Peg13)], and Cdca7 as control loci (Fig. 7, D to F, and figs. S15 to S20). We observed reduced methylation over the Pcdha and Pcdhb genes, while Pcdhg members did not seem to be affected by the Cdca7 G305V mutation (Fig. 7D and figs. S15 to S17), consistent with our findings in the spleen (Fig. 4H and fig. S9A). In agreement with results from patients with CDCA7/ICF3 (5), germline gene promoter methylation was not impaired in the mutant embryo (Fig. 7E and fig. S18). Furthermore, we found that methylation of the DNMT1-dependent (53) germline differentially methylated regions (DMRs) of the two imprinted loci tested was preserved in the Cdca7 G305V homozygote (Fig. 7F and fig. S19). These results show that CDCA7-associated aberrant cPcdh gene DNA hypomethylation is present during a developmental window when DNMT3B is expressed. However, since DNMT1 is required to maintain DNA methylation at E8.5 genome-wide (53), further investigation will be necessary to determine how the DNMTs and CDCA7 are connected. Combined, our in vivo data show that global aberrant DNA hypomethylation can translate to localized changes in chromatin state and gene expression and suggest that CDCA7 promotes DNA methylation at gene clusters where this epigenetic mark is used as a modulator of expression likelihood.

DISCUSSION
The number of reports where genome-wide defects in DNA methylation patterning are observed in individuals with germline mutations in epigenetic regulators is increasing. An important question that needs to be addressed when such widespread aberrant DNA methylation patterns are observed in developmental disease is how methylation alterations translate to gene expression changes. ICF syndrome is an exemplar disorder where abnormal DNA hypomethylation has been suggested to underlie disease phenotypic aspects (54). Using a mouse model carrying a germline ICF3/CDCA7 syndrome pathogenic variant, we show that CDCA7 is critical to maintaining global DNA methylation levels across multiple tissues and find that it functions as a transcriptional repressor by promoting DNA methylation of predominantly the B genomic compartment. CDCA7 disruption leads to the formation of large aberrantly hypomethylated domains, without affecting the deposition of the heterochromatin mark H3K9me3 in the spleen. Notably, ICF3 pathogenic CDCA7-associated aberrant hypomethylation translated to tissue-specific transcriptional dysregulation that affected distinct sets of genes. Deregulation correlated with the activity of the locus, rather than aberrant hypomethylation per se, thereby providing an intriguing example of how global DNA methylation alterations can result in localized, tissue-specific gene expression changes (Fig. 8). However, a current limitation of this model and our study is that DNA methylation levels were profiled in bulk, in tissues with mixed cell types. Therefore, we currently cannot exclude that partial methylation loss is a result of cellular heterogeneity.
Homozygous CDCA7 disruption does not affect the viability of mice up to 3 weeks of age. This phenotype was unexpected and makes CDCA7 stand out from other ICF-relevant HELLS/LSH (55) or DNMT3B knockout mouse models (51), which result in perinatal or embryonic lethality. Perhaps the closest resemblance can be seen between Cdca7 and two previously described hypomorphic Dnmt3b mouse models (13,56), as well as hypomorphic HELLS and DNMT1 mouse models (57,58), where at least partial homozygous viability to adulthood has been reported. Possible explanations for Cdca7 G305V homozygous viability may be related to the global methylation reduction of ~15% in the tissues tested, which is lower than what is, for instance, observed in embryonic lethal DNMT1 (59) or UHRF1 (60) knockout models. Furthermore, the characteristics of the aberrantly hypomethylated regions (located in the partially methylated, gene-poor, transcriptionally silent B genomic compartment) (Fig. 3, D and G) and the mild, tissue-specific disruption of gene expression patterns that we observed in the P21 spleen and E14.5 cerebrum (Figs. 4A and 5B) could explain viability. There may also be functional redundancy between CDCA7 and its paralog cell division cycle associated 7-like (CDCA7L) at some genomic sites since we have implicated both factors in DNA methylation regulation in humans (8). Notably, PMDs are a major target of aging-related DNA methylation alterations (17,61), and premature aging has been shown to occur in HELLS hypomorphic mutants (57). Therefore, a more extensive phenotyping of the Cdca7 G305V line will be required to reveal the full extent of transcriptional dysregulation and any phenotypes that may emerge during adulthood.

Our protein subcellular fractionation experiments revealed that CDCA7 is mainly bound to chromatin in vivo. The introduction of the ICF3 pathogenic G305V substitution largely abolished its chromatin localization (Fig. 1E). This is in agreement with published in vitro data, using frog-specific embryonic CDCA7 (CDCA7e) from Xenopus egg extract and a different set of ZF-located ICF3 missense mutations (9). The same study reported that missense mutations in the ZF-domain of CDCA7e lead to reduced association of HELLS with chromatin in vitro. However, in vivo, we observed no visible reduction of HELLS protein in the chromatin-bound fraction in Cdca7 G305V homozygous mutants (Fig. 2D). Possible explanations for the observed differences could be that different experimental systems were used (in vitro versus in vivo) or that our current assay is not sensitive enough. We also found that two other key factors of the maintenance machinery, UHRF1 and DNMT1, remain detectable in the chromatin-bound fraction in Cdca7 G305V mutant embryos, at amounts comparable to WT (Fig. 2D). While UHRF1 and DNMT1 are responsible for DNA methylation maintenance genome-wide (2), our study points to a specific link between CDCA7 and DNA methylation maintenance of the B genomic compartment that overlaps with heterochromatic, late-replicating regions. At these sites, DNA methylation is markedly reduced in Cdca7 G305V homozygotes (Fig. 3). It is therefore possible that only a fraction of cellular UHRF1 and DNMT1 is associated with CDCA7.
Fig. 8. Model of CDCA7-mediated DNA methylation utilization at two stochastically expressed loci. The multicopy GFP transgene and the variable Pcdha promoters are characterized by mosaic DNA methylation, which is associated with variegated transgene expression in erythrocytes and influences stochastic Pcdha isoform choice in neuronal cells, respectively. CDCA7 disruption leads to hypomethylation and an increase in the probability of a cell expressing GFP in erythrocytes and Pcdha isoforms in neurons.
A recent evolutionary analysis could trace back CDCA7 and HELLS to the last eukaryotic common ancestor (62), suggesting conserved functions in DNA methylation pathways. We find that CDCA7 predominantly targets the B genomic compartment (Fig. 3, E and G), where its function seems to converge with HELLS (63). How this specificity is achieved remains to be determined. Xenopus CDCA7e contains a CXXC-ZF domain that can bind DNA in vitro (9). One possibility could be that CDCA7 occupies these sites via direct DNA binding. It is also possible that CDCA7 engages in protein-protein interactions to promote DNA methylation. At late-replicating regions, DNA methylation is maintained by replication-uncoupled maintenance (64). In human cell lines, this process has been shown to rely on the interaction of the UHRF1 tandem tudor domain with methylated H3K9 and the chromatin remodeling activity of HELLS/LSH (65), and HELLS/LSH enhances UHRF1 chromatin association (66). Therefore, CDCA7 could participate in replication-uncoupled methylation maintenance through its interaction with HELLS/LSH (7,9,64). However, the exact mechanisms of how CDCA7 connects with its target sites and participates in the maintenance of DNA methylation remain to be determined.
We provide base-pair resolution DNA methylation maps for CDCA7 and show that CDCA7 promotes methylation beyond the previously described minor satellite repeats in mice (Fig. 3C). We find that CDCA7 influences methylation levels of cytosines located at H3K9me3-decorated sites (Fig. 4C), in agreement with previous observations in humans (8), suggesting that this role is conserved between species. We found that the heterochromatic mark H3K9me3 was maintained at CDCA7 aberrantly hypomethylated regions (Fig. 4E) and ERV superfamily transposable elements (TEs) in the spleen (Fig. 4D and fig. S7C). TEs remained transcriptionally silent (fig. S6E). In this respect, CDCA7 differs from previously reported HELLS mouse mutant cell lines, where reduced levels of both marks have been observed and linked to transcriptional dysregulation of ERV superfamily TEs (28,63). Currently, we do not know the mechanisms underlying these differences and whether they are specific to certain tissues or cell types. It could be that HELLS can associate with the H3K9me3 pathway independently of CDCA7 and that loss of both marks is required for TE derepression. At the same time, we cannot rule out differences in the severity of CDCA7 and HELLS hypomethylation defects at TEs and possibly other genomic sites in the mouse, which could influence H3K9me3 levels and transcriptional responses to DNA methylation loss.
We observed an altered H3K27me3 distribution over promoters and gene bodies in Cdca7 G305V mutant spleen and cerebrum (Fig. 4F and fig. S12D), where H3K27me3 gain correlated with aberrant hypomethylation (Fig. 4G). These results are consistent with previous observations in DNMT1 (29), HELLS (28,63), and UHRF1 (67) mutant mouse models, where H3K27me3 gain at hypomethylated regions and loss of H3K27me3 at prominent Polycomb target genes such as the Hox clusters have been described. However, unlike the DNMT1 and UHRF1 models (29,67), we do not observe up-regulation of Hox cluster gene expression in Cdca7 G305V mutants (Figs. 4A and 5B). This could be explained by the more subtle DNA methylation and H3K27me3 changes observed in Cdca7 G305V mutants. H3K27me3 gain did not occur at all aberrantly hypomethylated sites in the spleen, and we also did not identify any obvious features at genomic sites where H3K27me3 was lost in the Cdca7 G305V mutants. This suggests that the distribution of these marks relies on a multilayered regulation, in line with a previous report (29).
Our study identifies the clustered protocadherin genes as prominent targets of ICF3 pathogenic CDCA7-associated aberrant DNA hypomethylation in the spleen and cerebrum (Figs. 4H and 6B). Notably, we observe transcriptional dysregulation exclusively in the Cdca7 G305V mutant cerebrum (Figs. 4A and 5B), where this locus is normally active (36). This suggests that methylation loss per se is not sufficient to support transcription. Likely, the activity of Pcdh gene cluster members needs to be tightly controlled by several layers of epigenetic regulation to achieve cell type-specific stochastic expression. Consistent with this hypothesis are previous studies by others and us, which identified the cPcdh genes as targets of the de novo DNA methyltransferase DNMT3B (35), the human silencing hub (HUSH) complex (41,42), and the cohesin/CTCF complex member widely interspaced ZF motifs (WIZ) (68,69) via in vivo genetic approaches. Furthermore, CTCF is critical for cPcdh regulation by mediating promoter-enhancer interactions (43,70). In neuronal cells, we find increased CTCF binding at aberrantly hypomethylated Pcdha promoters, which correlated with increased Pcdha isoform expression in Cdca7 G305V mutants (Fig. 6). These results are consistent with a role for CTCF in the regulation of clustered protocadherin gene expression (45,71) and with previous reports of an inverse relationship between DNA methylation and CTCF binding at imprinted genes (72,73), variably methylated IAP elements (74), and the Pcdha cluster (41). In addition, we observed an increase in the proportion of neuronal cells expressing Pcdha isoforms, and individual neuronal cells expressed a higher number of variable isoforms (Fig. 6, D and E). Cell-to-cell variation in Pcdha DNA methylation levels and CTCF binding could potentially explain how higher numbers of Pcdha isoforms become activated in individual neuronal cells in Cdca7 G305V mutants. In sum, although it remains to be determined whether individual Cdca7 G305V mutant neurons exhibit changes in Pcdha promoter-enhancer interactions and we cannot exclude additional roles for CDCA7, it seems possible that DNA methylation is used to limit CTCF binding to specific Pcdha members and thereby safeguards isoform diversity (stochastic choice) in individual neurons.
Of the three Pcdh gene clusters, Pcdha and Pcdhb follow a similar pattern of dysregulation (aberrant hypomethylation and H3K27me3 gain in the spleen and H3K27me3 gain and isoform choice dysregulation in cerebrum) in the Cdca7 G305V mutants, whereas the Pcdhg cluster appears hardly affected at both the epigenetic and transcriptional levels (figs. S9A and S12E). In addition to CDCA7, the disruption of WIZ (only the Pcdhb genes are dysregulated) or the structural maintenance of chromosomes flexible hinge domain containing 1 (SMCHD1) (up-regulation of Pcdha and Pcdhb cluster members but hardly any effect on the Pcdhg cluster) differentially affects the regulation of the cPcdh genes (68,75). Furthermore, recent work investigating histone modification dynamics at the cPcdh genes during mouse embryonic brain development found differences in H3K27me3 deposition (76). While H3K27me3 is present over the Pcdhg cluster from E10.5 to P0, this mark appears to be lost from the Pcdha cluster around E13.5 (76). Together, this suggests that although the Pcdha and Pcdhg clusters share a similar genomic organization (77), their epigenetic regulation differs, as we observe in the Cdca7 G305V mutants.
ICF syndrome is a genetically heterogeneous disease and can be caused by mutations in DNMT3B (49,51,52), ZF and BTB domain containing 24 (ZBTB24) (78,79), CDCA7 (3), HELLS (3), or UHRF1 (80). Our data suggest that CDCA7 functionally overlaps in the regulation of clustered Pcdh gene expression with at least two other ICF syndrome-causing factors, the de novo DNA methyltransferase DNMT3B and the chromatin remodeler HELLS, such that their disruption leads to aberrant hypomethylation and increased expression of Pcdh isoforms in neuronal tissue (35) or mouse embryonic fibroblasts (MEFs) (63), respectively. This is intriguing since the PCDHA promoters are among the few loci found to be aberrantly hypomethylated in the blood of patients with ICF carrying DNMT3B, ZBTB24, CDCA7, HELLS, or UHRF1 mutations (5,80). Considering that the clustered PCDH genes are critical for generating neuronal diversity (81,82) and their suggested links to neurological disease (83,84), clustered PCDH dysregulation could be a potential contributor to the neurodevelopmental symptoms, including intellectual disability, that have been described in patients with ICF (3).

Ethics statement
All procedures involving animals were approved by the Animal Ethics Committee of the Leiden University Medical Center and by the Commission of Biotechnology in Animals of the Dutch Ministry of Agriculture.

Embryo dissections
All embryos were produced by natural timed matings. Noon of the day that the vaginal plug was detected was considered E0.5.

DNA isolation
gDNA was isolated using the salt-extraction method. Briefly, tissues were lysed in cell lysis buffer [50 mM tris-HCl (pH 8), 4 mM EDTA (pH 8), and 2% SDS] plus proteinase K (390973P, VWR) and incubated at 55°C overnight. Cell lysate was treated with ribonuclease A (EN0531, Thermo Fisher Scientific) at 37°C for 1 hour. Saturated NaCl buffer was added, followed by the addition of isopropanol to precipitate gDNA and washing with 70% ethanol (EtOH). gDNA was dissolved in water, and the concentration was measured using NanoDrop.

RNA isolation and RT-qPCR
Total RNA was isolated with QIAzol (5346994, Qiagen). About 1 μg of total RNA was used for reverse transcription using the RevertAid H Minus First Strand cDNA Synthesis Kit (K1632, Thermo Fisher Scientific). RT-qPCR was performed in triplicate on a C1000 Thermal Cycler (Bio-Rad) with SYBR Green (170-8887, Bio-Rad). Expression data were normalized to β-actin; CFX Manager (Bio-Rad) was used for data analyses and GraphPad for visualization. Primer sequences are provided in table S1.
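The normalization to β-actin is handled inside CFX Manager in this study, so the exact calculation is not spelled out in the text. As an illustration only, a minimal sketch of the widely used 2^-ΔΔCt calculation is given below; it assumes roughly 100% primer efficiency, and the function name and the Ct values in the usage note are hypothetical.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^-ddCt method.

    Each argument is a list of technical-replicate Ct values.
    Assumption (not from the source): amplification efficiency is ~100%,
    so one cycle corresponds to a twofold difference.
    """
    def mean(values):
        return sum(values) / len(values)

    # dCt: target Ct minus reference-gene Ct (e.g., beta-actin)
    d_ct_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    d_ct_control = mean(ct_target_control) - mean(ct_ref_control)

    # ddCt: sample dCt relative to the control (e.g., WT) dCt
    dd_ct = d_ct_sample - d_ct_control
    return 2 ** (-dd_ct)
```

For example, with hypothetical triplicate Ct values of 24 (target) and 18 (reference) in the mutant and 25 and 18 in WT, the function reports a twofold higher relative expression in the mutant.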

Subcellular protein fractionation
Subcellular protein fractionation was performed according to the manufacturer's instructions (87790, Thermo Fisher Scientific). Briefly, frozen tissues, stored at −80°C after dissection, were pooled by genotype and homogenized using a Dounce homogenizer (12 strokes with A pestle followed by 6 to 12 strokes with B pestle). E10.5 tissue was further processed following the protocol for 50 mg of tissue (87790, Thermo Fisher Scientific). A BCA kit (23225, Thermo Fisher Scientific) was used to measure the protein concentration of each fraction.

Sanger bisulfite sequencing
One microgram of gDNA was bisulfite converted using the Zymo Lightning Kit (D5030, BaseClear) according to the manufacturer's instructions. PCR products were amplified using HotStarTaq DNA polymerase (87722, Thermo Fisher Scientific), cloned into the TOPO TA cloning kit in pCR2.1 (450641, Thermo Fisher Scientific), and transformed into Escherichia coli DH5α. DNA from individual colonies was Sanger-sequenced. Sequences with a conversion rate of at least 96% were analyzed using BiQ Analyzer software (86). Lollipop figures were made using the Quantification tool for Methylation Analysis (QUMA) (87). Primer sequences are provided in table S1.
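The 96% conversion-rate filter was applied with BiQ Analyzer, whose internal calculation is not described here. As a rough illustration of the idea, the sketch below estimates bisulfite conversion from cytosines outside CpG context, which are expected to be unmethylated and therefore read as T after conversion. It assumes a gap-free, top-strand alignment of equal length; the function names are hypothetical.

```python
def conversion_rate(reference, read):
    """Fraction of non-CpG cytosines in the reference that read as T.

    Assumptions (not from the source): `reference` and `read` are aligned,
    equal-length, top-strand sequences; non-CpG cytosines are unmethylated.
    Returns None when no informative cytosine is available.
    """
    ref, rd = reference.upper(), read.upper()
    if len(ref) != len(rd):
        raise ValueError("reference and read must be aligned to equal length")

    total = converted = 0
    for i, base in enumerate(ref):
        if base != "C":
            continue
        if i + 1 >= len(ref):
            continue  # sequence context of the last base is unknown
        if ref[i + 1] == "G":
            continue  # CpG site: may be genuinely methylated, so skip it
        if rd[i] not in "CT":
            continue  # sequencing error or gap character, not informative
        total += 1
        converted += rd[i] == "T"
    return converted / total if total else None


def passes_filter(reference, read, threshold=0.96):
    """True when the clone meets the conversion threshold (96% in the text)."""
    rate = conversion_rate(reference, read)
    return rate is not None and rate >= threshold
```

Clones that fail the filter would be excluded before methylation at CpG positions is scored.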

Southern blot
Southern blotting was used to assess DNA methylation levels at satellite repeats and IAPs. Briefly, 1 μg of gDNA was digested with HpaII (methylation sensitive) or MspI (methylation insensitive) (ER0511 and RR0541, Thermo Fisher Scientific). DNA was blotted onto a Hybond-XL membrane (GERPN203S, GE Healthcare) for 6 hours and ultraviolet-cross-linked to the membrane with a Stratagene 1200 using the auto cross-link setting (1200 μJ). Hybridization was done overnight at 65°C. The filter was exposed and analyzed on a Molecular Devices Storm 840 Scanner and Image Processor. Primer sequences for the generation of the 32P-labeled probes can be found in table S1.

Whole-genome bisulfite sequencing
gDNA was isolated as described above, and WGBS was performed at BGI (samples were sequenced on HiSeq X Ten). Raw sequences were filtered by quality with TrimGalore using default parameters. Reads with lengths smaller than 20 bp and error rates higher than 0.1 were discarded. Sequences were aligned to mm10 (the transgene sequence was included in the mm10 FASTA file) using the Bismark aligner v0.18 (90). To increase sensitivity, we used the parameter "-N 1," as well as "--gzip" and "--bowtie2." Duplicates were removed with Deduplicate Bismark (Bismark package). Methylation calling was performed by Bismark Methylation Extractor, using default parameters with the following exceptions: "--paired-end," "--ignore_r2 2," and "--bedGraph." Correlation was calculated and plots were made using the Methylkit R package (91). Differentially methylated CpGs were calculated using the DSS R package v2.3 (Bioconductor package) with the following parameters: P < 0.05 and alpha > 0.15. Subsequently, the annotation was done with ChIPseeker (92). For violin plots, the genome was divided into 1-kb windows [bedtools (93) was used to create 1-kb bins], and the average methylation for each bin per genotype was calculated; the bins were then annotated using ChIPseeker, after which ggplot was used to create violin plots. Mouse (mm10) soloCpG (WCGW), HMD, and PMD annotations were obtained from previous publications (17), and bedtools was used for the intersection of the datasets. To visualize the distribution of methylation levels for soloCpGs located in HMDs and PMDs, the Python libraries seaborn (v0.12.2) and matplotlib (v3.6.3) were used.
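The 1-kb window averaging used for the violin plots was done with bedtools in this study. The sketch below only illustrates the underlying calculation in plain Python. It assumes that each CpG contributes its own methylation percentage equally to its window (an unweighted mean rather than a coverage-weighted one), which the text does not specify; the input tuple layout and function name are hypothetical.

```python
from collections import defaultdict


def mean_methylation_per_bin(cpg_calls, bin_size=1000):
    """Average CpG methylation (%) in fixed-size genomic windows.

    cpg_calls: iterable of (chrom, pos, methylated_reads, unmethylated_reads)
    with 0-based positions (hypothetical layout for this sketch).
    Returns {(chrom, bin_start): mean methylation percentage}.
    """
    sums = defaultdict(lambda: [0.0, 0])  # key -> [sum of percentages, n CpGs]
    for chrom, pos, meth, unmeth in cpg_calls:
        coverage = meth + unmeth
        if coverage == 0:
            continue  # CpG without reads carries no information
        key = (chrom, (pos // bin_size) * bin_size)
        sums[key][0] += 100.0 * meth / coverage
        sums[key][1] += 1
    return {key: total / n for key, (total, n) in sums.items()}
```

Windows without any covered CpG are simply absent from the result, so they do not enter the per-genotype distributions.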

ONT long-read sequencing adaptive sampling
gDNA was isolated as described above. DNA concentration was determined by Qubit using the Qubit High Sensitivity Kit, and size was determined on a Femto Pulse System (Agilent Technologies). ONT sequencing libraries were prepared using the gDNA ligation kit (SQK-LSK114) and sequenced according to the vendor's instructions. Briefly, 600 ng of gDNA was used as input and yielded approximately 20 fmol of library, which was loaded on a PromethION Flowcell (FLO-PRO114M) and sequenced on a PromethION 24 device (A100). Adaptive sampling in MinKNOW (v22.10.7) was enabled during the sequencing of these samples using fast basecalling; a BED file was provided for the regions of interest, and the mm10 mouse genome assembly was used as the reference for the enrichment. After 20 hours, the run yielded around 400k reads, which accounts for 3.53 Gb with an N50 of 28 kb. Basecalling with modification detection was done after the adaptive sampling run was completed using Guppy version 6.3.9 with model dna_r10.4.1_e8.2_400bps_modbases_5mc_cg_sup_prom.cfg. The BAM file containing the reads and the modifications was mapped against mm10 using SAMtools (version 1.13) with operation fastq and flags -TMM,ML and piped to Minimap2 (version 2.24), where no secondary mapping was allowed. The output is a SAM file that also contains the modification information. The SAM files were converted to BAM files, sorted, and indexed using SAMtools. Methylartist (88) was used to visualize methylation patterns across various genomic regions, using the commands "locus" and "region." Default parameters for smooth window size were used and are indicated in the figure panels.
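The run statistics above include an N50 read length. For readers less familiar with the metric, the following minimal sketch shows how N50 is conventionally computed from a list of read lengths; it is not taken from the study's pipeline, and the function name is hypothetical.

```python
def n50(read_lengths):
    """Length L such that reads of length >= L contain at least half of all bases."""
    lengths = sorted(read_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("read_lengths must contain at least one read")
```

Because N50 is weighted by bases rather than by reads, it typically exceeds the mean read length. For orientation, 3.53 Gb spread over roughly 400k reads corresponds to a mean of about 8.8 kb, compared with the reported N50 of 28 kb.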

Genome-wide analysis of DNA methylation and histone marks at repetitive elements
To generate profile plots over repetitive elements, deepTools (97) and the UCSC RepeatMasker track (mm10) were used. For the heatmaps, coverage and BAM files generated during the analyses of the WGBS and H3K9me3 ChIP-seq data were used. X and Y chromosomes were filtered using SAMtools (v1.10), and all samples were down-sampled to maintain 25 M reads per sample using the function DownsampleSam from Picard (v2.18.7). For heatmaps, methylation percentage at individual repeats was measured by overlapping the methylation value with RepeatMasker coordinates using the function bedtools intersect (v2.30) (93). H3K9me3 coverage was measured over individual TEs extracted from the UCSC RepeatMasker track using the function bedtools multicov (v2.30). Heatmaps showing the average percentage of methylation and H3K9me3 enrichment over input at TEs of interest were generated in R using the pheatmap (v1.0.12) and dplyr (v0.8.3) packages.
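The overlap of per-CpG methylation values with repeat coordinates was done with bedtools intersect in this study. The sketch below illustrates the same idea in plain Python using binary search over sorted CpG positions. It assumes BED-style 0-based, half-open repeat intervals; the data layout, the function name, and the repeat names in the usage note are illustrative and not taken from the source.

```python
from bisect import bisect_left


def mean_methylation_per_repeat(cpgs, repeats):
    """Average methylation (%) of the CpGs falling inside each repeat.

    cpgs: {chrom: [(pos, percent_methylation), ...]} sorted by pos (0-based).
    repeats: iterable of (chrom, start, end, name), 0-based half-open intervals.
    Returns {(chrom, start, end, name): mean percent, or None if no CpG overlaps}.
    """
    positions = {chrom: [pos for pos, _ in sites] for chrom, sites in cpgs.items()}
    result = {}
    for chrom, start, end, name in repeats:
        pos_list = positions.get(chrom, [])
        lo = bisect_left(pos_list, start)  # first CpG with pos >= start
        hi = bisect_left(pos_list, end)    # first CpG with pos >= end
        values = [pct for _, pct in cpgs[chrom][lo:hi]] if lo < hi else []
        key = (chrom, start, end, name)
        result[key] = sum(values) / len(values) if values else None
    return result
```

For example, a repeat spanning positions 50 to 200 that contains two CpGs at 90% and 70% methylation would be assigned 80%; the resulting per-repeat values could then be averaged by repeat family before plotting.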

Fig. 2. CDCA7 is required to maintain global DNA methylation levels in multiple tissues. (A) Bar chart showing DNA methylation levels in different tissues of P21 animals measured by LUMA. Three biological replicates per genotype [error bars = SEM; *P < 0.05, **P < 0.005, and ***P < 0.0005 (unpaired t test)]. (B) Southern blots showing minor satellite, intracisternal A particle (IAP), and major satellite repeat methylation levels in the P21 spleen. Genomic DNA was digested with HpaII (H; methylation sensitive) or MspI (M; methylation insensitive). Two or three biological replicates per genotype. (C) Western blot showing DNMT1, UHRF1, and HELLS protein levels in P21 spleen; two biological replicates per genotype. TUBULIN or VINCULIN was used as a loading control. (D) Left: Western blot showing DNMT1, HELLS, and UHRF1 protein levels in CE, ME, SNE, and CB isolated from E10.5 embryos (four pooled embryos per genotype). TUBULIN and H3 were used as loading controls. Right: Bar chart showing DNA methylation levels in E10.5 embryos measured by LUMA; two biological replicates per genotype.

Fig. 3. CDCA7 primarily mediates DNA methylation of the B genomic compartment in the spleen. (A) Boxplots showing global methylation levels measured by WGBS. Each boxplot indicates one biological replicate [middle line indicates median, box limits indicate upper and lower quartiles, and whiskers extend to 1.5× interquartile range (IQR) from quartiles]. (B) Middle: Stacked bar chart showing proportions of differentially methylated or unchanged CpGs in Cdca7 G305V homozygotes. Proportions of hypomethylated (left) or hypermethylated (right) CpG sites with respect to their genomic annotation. (C) Violin plots showing WGBS methylation levels at different genomic features (average methylation levels over 1-kb tiles). Average of two biological replicates per genotype. (D) Genome browser view depicting methylation profiles over part of chromosome 2; two replicates per genotype. Replication ChIP-seq data (ENCFF001JUT) and BED tracks for previously defined common partially methylated domains (PMDs) and highly methylated domains (HMDs) (17) and cLADs (GSE17051) and RefSeq annotations are shown. Yellow shading, representative hypomethylated domains; gray rectangle, representative gene-poor hypomethylated region (zoom-in in fig. S4D, top); dark orange rectangle, representative hypomethylated gene cluster (zoom-in in fig. S4D, bottom). (E) Top: Active A (green) and inactive B (blue) genomic compartments of the nucleus are illustrated. Profile plots showing methylation levels ±3 kb over cLADs (GSE17051). Bottom: Common HMDs and PMDs (17). Methylation levels were calculated over 50-bp bins for cLADs and 10-bp bins for common PMDs and HMDs. Numbers indicate the number of regions over which average methylation was calculated. (F) Violin plots showing methylation at previously defined solo-WCGWs (17) located in common HMDs or PMDs; two biological replicates per genotype. The number indicates the number of CpGs plotted (fig. S5B). (G) Profile plots showing methylation levels ±3 kb over (top) L1 and B1 elements (calculated over 10-bp bins) and (bottom) over previously defined L1- and SINE-enriched genes (21) (calculated over 10-bp bins).

Fig. 5. Dysregulation of clustered protocadherin gene expression and chromatin state in Cdca7 G305V homozygous cerebrum. (A) Uniform Manifold Approximation and Projection (UMAP) representations of E14.5 cerebrum integrated snRNA-seq data (n = 25,923 nuclei). Left UMAP: The colors represent four different samples grouped by genotype. Right: Integrated dataset colored according to the four annotated cell types. A small number of nuclei (n = 201 in one WT and n = 104 in one Cdca7 G305V/G305V biological replicate) that could not be assigned to a specific cell type were designated "unknown" and were not included in the subsequent analyses. (B) MA plots showing differential RNA levels between WT and Cdca7 G305V homozygous E14.5 cerebrum for the different cell types. Red, significantly up-regulated; blue, significantly down-regulated genes (Padj ≤ 0.01 and |log2FC| ≥ 1.5). Horizontal dashed lines represent log2FC thresholds of 1.5 and −1.5; n = 2 biological replicates per genotype. (C) Heatmap showing average H3K4me3 levels at 21 differentially enriched peaks in WT and Cdca7 G305V homozygous E14.5 cerebrum (DiffBind, default settings with DESeq2, FDR ≤ 0.05). (D) Miami plot visualization of H3K27me3 ChIP-seq differential analysis using sicer_df showing (top) regions that gain H3K27me3 (Cdca7 G305V/WT) and (bottom) regions that lose H3K27me3 (WT/Cdca7 G305V). All top-scoring dots on chromosome 18 correspond to the clustered Pcdh locus (highlighted in yellow). All four Hox clusters lose H3K27me3 as indicated (triangles indicate out-of-range values). (E) Bar chart showing DNA methylation levels in E14.5 cerebrum measured by LUMA. Three biological replicates per genotype [error bar = SEM; ***P < 0.0005 (unpaired t test)]. (F) Genome browser screenshot of the clustered Pcdha locus. Representative tracks for snRNA-seq (EN, excitatory neurons; IN, inhibitory neurons) and ChIP-seq (H3K4me3 and H3K27me3) from E14.5 cerebrum are shown (rectangles below H3K4me3 and H3K27me3 tracks, called peaks; RefSeq annotation is shown below the tracks).

Fig. 7. CDCA7 is required for Pcdh cluster methylation in the embryo. (A) Western blot showing DNMT3B and CDCA7 protein levels in E7.5, E8.5, and E10.5 whole WT embryos; VINCULIN and TUBULIN were used as loading controls. (B) DNA methylation levels of Pcdha1 and Pcdha8 promoters measured by Sanger bisulfite sequencing in WT and Cdca7 G305V homozygous E7.5 whole embryos (filled circles, methylated cytosines; empty circles, unmethylated cytosines). (C) Schematic representation of the pipeline for adaptive sampling with Oxford Nanopore Technologies (ONT) sequencing, where only regions of interest, provided within a BED file, are sequenced. (D) Methylation levels over the Pcdh cluster, (E) the Mael and Ildr2 promoter regions, and (F) the Kcnq1 imprinted region obtained from ONT long-read sequencing. From top to bottom, the figure panels show (i) the genomic position of interest, (ii) a diagram displaying the correspondence between genome space and CpG space, and (iii) the smoothed methylated fraction plot.